Feature Selection Based on Relative Attribute Dependency: An Experimental Study

نویسندگان

  • Jianchao Han
  • Ricardo Sanchez
  • Xiaohua Hu
چکیده

Most existing rough set-based feature selection algorithms suffer from intensive computation of either discernibility functions or positive regions to find attribute reduct. In this paper, we develop a new computation model based on relative attribute dependency defined as the proportion of the projection of the decision table on a condition attributes subset to the projection of the decision table on the union of the condition attributes subset and the decision attributes set. To find an optimal reduct, we use information entropy conveyed by the attributes as the heuristic. A novel algorithm to find optimal reducts of condition attributes based on the relative attribute dependency is implemented using Java, and is experimented with 10 data sets from UCI Machine Learning Repository. We conduct the comparison of data classification using C4.5 with the original data sets and their reducts. The experiment results demonstrate the usefulness of our algorithm.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Consistency Based Attribute Reduction

Rough sets are widely used in feature subset selection and attribute reduction. In most of the existing algorithms, the dependency function is employed to evaluate the quality of a feature subset. The disadvantages of using dependency are discussed in this paper. And the problem of forward greedy search algorithm based on dependency is presented. We introduce the consistency measure to deal wit...

متن کامل

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of the electrocardiogram signal classification and segmentation methodologies indicates that algorithms which are able to effectively handle the nonstationary and uncertainty of the signals should be used for ECG analysis. Ex...

متن کامل

Extended MULTIMOORA method based on Shannon entropy weight for materials selection

Selection of appropriate material is a crucial step in engineering design and manufacturing process. Without a systematic technique, many useful engineering materials may be ignored for selection. The category of multiple attribute decision-making (MADM) methods is an effective set of structured techniques. Having uncomplicated assumptions and mathematics, the MULTIMOORA method as an MADM appro...

متن کامل

Attribute selection with fuzzy decision reducts

Rough set theory provides a methodology for data analysis based on the approximation of concepts in information systems. It revolves around the notion of discernibility: the ability to distinguish between objects, based on their attribute values. It allows to infer data dependencies that are useful in the fields of feature selection and decision model construction. In many cases, however, it is...

متن کامل

Rough Sets and Confidence Attribute Bagging for Chinese Architectural Document Categorization

Aiming at the problems of the traditional feature selection methods that threshold filtering loses a lot of effective architectural information and the shortcoming of Bagging algorithm that weaker classifiers of Bagging have the same weights to improve the performance of Chinese architectural document categorization, a new algorithm based on Rough set and Confidence Attribute Bagging is propose...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005